Simultaneous Alignment and Folding of Protein Sequences

نویسندگان

  • Jérôme Waldispühl
  • Charles W. O'Donnell
  • Sebastian Will
  • Srinivas Devadas
  • Rolf Backofen
  • Bonnie Berger
چکیده

Accurate comparative analysis tools for low-homology proteins remains a difficult challenge in computational biology, especially sequence alignment and consensus folding problems. We present partiFold-Align, the first algorithm for simultaneous alignment and consensus folding of unaligned protein sequences; the algorithm's complexity is polynomial in time and space. Algorithmically, partiFold-Align exploits sparsity in the set of super-secondary structure pairings and alignment candidates to achieve an effectively cubic running time for simultaneous pairwise alignment and folding. We demonstrate the efficacy of these techniques on transmembrane β-barrel proteins, an important yet difficult class of proteins with few known three-dimensional structures. Testing against structurally derived sequence alignments, partiFold-Align significantly outperforms state-of-the-art pairwise and multiple sequence alignment tools in the most difficult low-sequence homology case. It also improves secondary structure prediction where current approaches fail. Importantly, partiFold-Align requires no prior training. These general techniques are widely applicable to many more protein families (partiFold-Align is available at http://partifold.csail.mit.edu/ ).

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Structural Characteristics of Stable Folding Intermediates of Yeast Iso-1-Cytochrome-c

Cytochrome-c (cyt-c) is an electron transport protein, and it is present throughout the evolution. More than 280 sequences have been reported in the protein sequence database (www.uniprot.org). Though sequentially diverse, cyt-c has essentially retained its tertiary structure or fold. Thus a vast data set of varied sequences with retention of similar structure and fun...

متن کامل

A max-margin model for efficient simultaneous alignment and folding of RNA sequences

MOTIVATION The need for accurate and efficient tools for computational RNA structure analysis has become increasingly apparent over the last several years: RNA folding algorithms underlie numerous applications in bioinformatics, ranging from microarray probe selection to de novo non-coding RNA gene prediction. In this work, we present RAF (RNA Alignment and Folding), an efficient algorithm for ...

متن کامل

A generalization of Profile Hidden Markov Model (PHMM) using one-by-one dependency between sequences

The Profile Hidden Markov Model (PHMM) can be poor at capturing dependency between observations because of the statistical assumptions it makes. To overcome this limitation, the dependency between residues in a multiple sequence alignment (MSA) which is the representative of a PHMM can be combined with the PHMM. Based on the fact that sequences appearing in the final MSA are written based on th...

متن کامل

An Evolutionary Relationship Between Stearoyl-CoA Desaturase (SCD) Protein Sequences Involved in Fatty Acid Metabolism

Background: Stearoyl-CoA desaturase (SCD) is a key enzyme that converts saturated fatty acids (SFAs) to monounsaturated fatty acids (MUFAs) in fat biosynthesis. Despite being crucial for interpreting SCDs’ roles across species, the evolutionary relationship of SCD proteins across species has yet to be elucidated. This study aims to present this evolutionary relationship based on amino aci...

متن کامل

A Novel Genetic classification of SARS coronavirus-2 following whole nucleic acid and protein alignment of the isolated viruses

Background and aims: The end of 2019 has marked the year, which the human population encountered a novel virus; SARS-CoV-2 that causes a disease namely COVID-19. Here we focused on the genome and protein mutations and subsequently suggested a new classification of the SARS-CoV-2. Materials and Methods: Our study showed that some extra positions in the virus genome play a key role in the SARS-C...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Journal of computational biology : a journal of computational molecular cell biology

دوره 21 7  شماره 

صفحات  -

تاریخ انتشار 2009